Phase 1: connection-mode enum + backwards-compat (issue #5989)#6047
Phase 1: connection-mode enum + backwards-compat (issue #5989)#6047MichaelUray wants to merge 19 commits intonetbirdio:mainfrom
Conversation
…nfig Additive change for issue netbirdio#5989 Phase 1. New fields use new tag numbers (11, 12, 13); existing fields (including LazyConnectionEnabled tag 6) are unchanged so old clients ignore the additions and old servers send UNSPECIFIED, which the new client maps back via the legacy boolean. Note: the regenerated pb.go files now report protoc v5.29.3 in their header (this branch was generated with locally-installed protoc 29.3 instead of upstream's v7.34.1). Functionally identical; header diff is the only delta beyond the actual schema additions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Defines Mode enum (relay-forced, p2p, p2p-lazy, p2p-dynamic plus the client-only sentinels Unspecified and FollowServer), ParseString for CLI/env input, ToProto/FromProto for wire translation, and the two backwards-compat helpers ResolveLegacyLazyBool / ToLazyConnectionEnabled that bridge the old Settings.LazyConnectionEnabled boolean. Phase 1 of issue netbirdio#5989. Pure addition -- no existing callers touched in this commit; the engine/conn_mgr migration follows in subsequent commits in the same PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…on warns NB_CONNECTION_MODE wins over the legacy pair (NB_FORCE_RELAY, NB_ENABLE_EXPERIMENTAL_LAZY_CONN); when the legacy pair is set together, NB_FORCE_RELAY wins (most-restrictive, mirrors the group-conflict rule from issue netbirdio#5990). Each legacy var emits a one-shot deprecation warning when it actually contributes to the resolved mode. NB_LAZY_CONN_INACTIVITY_THRESHOLD becomes an alias for the future relay_timeout setting and warns once. IsForceRelayed() is kept for callers that have not yet been migrated (conn.go, statusrecorder); they will be updated in the engine/conn refactor commits later in this PR. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three new CLI flags map onto the new connection-mode plumbing: - --connection-mode <relay-forced|p2p|p2p-lazy|p2p-dynamic|follow-server> - --relay-timeout <seconds> - --p2p-timeout <seconds> Plumbed through three sites in cmd/up.go (SetConfigRequest, ConfigInput, LoginRequest), persisted in profilemanager.Config, and added as new fields on the daemon.proto IPC messages. Empty / not-changed flags fall back to the server-pushed value (which itself falls back to the legacy lazy_connection_enabled boolean for old servers). Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EngineConfig gains ConnectionMode, RelayTimeoutSeconds, P2pTimeoutSeconds. ConnMgr now stores the resolved Mode plus the raw inputs (env, config) so it can re-resolve when the server pushes a new PeerConfig. UpdatedRemoteFeatureFlag is renamed to UpdatedRemotePeerConfig and takes the full PeerConfig pointer; a thin shim with the old name delegates to it for callers that haven't been updated yet. connect.go copies the three new fields from profilemanager.Config into the EngineConfig builder, with a tolerant parser that logs and falls through to Unspecified on invalid input. Phase 1 of issue netbirdio#5989. peer/conn.go forwarding follows in C4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ConnConfig gains a Mode field forwarded from the engine. Open() now checks Mode == ModeRelayForced instead of calling the global env-reader IsForceRelayed(). The local 'forceRelay' variable name is renamed to 'skipICE' to make the new branching intent explicit. The PeerStateUpdate block at the end of Open() also reads from conn.config.Mode now, so the StatusRecorder sees the per-peer mode rather than the global env var. A single remaining caller of IsForceRelayed() (srWatcher.Start in engine.go) is left for a follow-up; that path uses a process-wide flag not per-peer state, so it can be migrated in Phase 2 once srWatcher itself learns about ConnectionMode. Phase 1 of issue netbirdio#5989. Engine forwarding (C5) follows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
createPeerConn now reads ConnMgr.Mode() and copies it into peer.ConnConfig, so the per-peer Open() loop in conn.go can take the ModeRelayForced skip-ICE branch without reading the global env var. This is the last wiring commit for the client side of Phase 1; the server-side mgmt changes (Settings + OpenAPI + handler + audit + NetworkMap-build) follow in Section D. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All three fields are nullable to distinguish 'use built-in default' (NULL) from explicit values (incl. 0 = never tear down). Copy() now deep-clones the new pointer fields via two small helpers. GORM AutoMigrate creates the new columns at first start; existing accounts have NULL in all three columns and resolve via the legacy LazyConnectionEnabled boolean. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tings Three new optional, nullable fields with descriptions of the NULL = built-in-default semantics and the Phase-1-vs-Phase-2 status of p2p-dynamic. Regenerated types.gen.go via the existing oapi-codegen tooling. The generated AccountSettingsConnectionMode enum has the canonical values relay-forced / p2p / p2p-lazy / p2p-dynamic, plus a Valid() helper for handler-side validation. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PUT /api/accounts/{id} now accepts connection_mode (validated against
the four-value enum via the generated AccountSettingsConnectionMode.
Valid()), p2p_timeout_seconds and relay_timeout_seconds. NULL in the
JSON body keeps the existing value untouched (= "no client-side
override on this round-trip"); explicit NULL-clear via API uses a
distinct PATCH-style call which is out-of-scope for Phase 1.
Response payload mirrors the input fields back as nullable so the
dashboard can distinguish "use default" from "explicit value".
Phase 1 of issue netbirdio#5989. Audit-event emission follows in D5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AccountConnectionModeChanged (121), AccountRelayTimeoutChanged (122), AccountP2pTimeoutChanged (123) -- emitted from account.go when settings change. Per-peer / per-group event codes are reserved for Phase 3 (issue netbirdio#5990). Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
handleConnectionModeSettings is invoked from the same diff-detection block as handleLazyConnectionSettings; emits one StoreEvent per changed field (ConnectionMode, RelayTimeoutSeconds, P2pTimeoutSeconds) with old/new values in the meta payload. Four small ptr-equality / deref helpers are added for nullable string and uint32 fields. They are package-private and named after the existing convention used elsewhere in the package. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…elds Two changes in one commit because they're inseparable: 1. Move client/internal/peer/connectionmode/ to shared/connectionmode/. The package now needs to be importable from BOTH client/ and management/ (which is impossible while it lives under client/internal/ per Go's internal-package rule). All imports updated; tests pass on both sides. 2. Extend management/internals/shared/grpc/conversion.go::toPeerConfig to populate the three new PeerConfig fields (ConnectionMode, P2PTimeoutSeconds, RelayTimeoutSeconds) using the connectionmode helpers. The legacy LazyConnectionEnabled boolean is now derived from the resolved Mode via ToLazyConnectionEnabled() rather than copied verbatim from Settings -- this is the central backwards-compat contract: old clients see only the boolean, new clients prefer the explicit enum and ignore the bool. Resolution rules (Phase 1, account-wide only): - Settings.ConnectionMode != nil and parses -> wins - Otherwise -> ResolveLegacyLazyBool(LazyConnectionEnabled) - timeouts: Settings.RelayTimeoutSeconds / P2pTimeoutSeconds when non-NULL, else 0 (= server has no preference; client uses built-in default) Per-peer / per-group resolution comes in Phase 3 (netbirdio#5990). Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nine sub-cases cover the Phase-1 resolution matrix from spec section 3: - no settings -> default (P2P + lazy=false) - legacy bool only -> mapped via ResolveLegacyLazyBool - explicit ConnectionMode -> wins over the legacy bool - timeouts propagate - garbage ConnectionMode value -> tolerant fallback to legacy bool Particular attention to the structural compat gap: relay-forced cannot be expressed via the legacy boolean, so the wire field for old clients is sent as false. Documented in the spec, asserted here. Existing TestAccount_GetPeerNetworkMap remains green: existing test peers have ConnectionMode=NULL in Settings, falls through to the legacy ResolveLegacyLazyBool(false) -> ModeP2P -> wire bool false. Phase 1 of issue netbirdio#5989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds a connection-mode enum and per-peer timeout overrides end-to-end: CLI flags, env/profile resolution, client conn manager/engine/peer changes, management API/proto/storage updates, and server audit/activity handling for connection-mode and timeout settings. ChangesConnection Mode & Timeout End-to-End
Sequence DiagramsequenceDiagram
actor User
participant CLI as Client CLI
participant Env as Env/Profile
participant ConnMgr as Connection Manager
participant Engine as Engine
participant Peer as Peer Conn
participant Server as Management Server
User->>CLI: provide flags/env/profile
CLI->>Engine: start / login (includes flags when changed)
Env->>ConnMgr: ResolveModeFromEnv()
CLI->>ConnMgr: profile/config inputs
ConnMgr->>ConnMgr: resolveConnectionMode(env → config → server)
ConnMgr->>ConnMgr: set Mode(), RelayTimeout()
Server->>Engine: updateNetworkMap() (sends PeerConfig w/ ConnectionMode + timeouts)
Engine->>ConnMgr: UpdatedRemotePeerConfig(pc)
ConnMgr->>ConnMgr: re-resolve mode (env → config → server)
ConnMgr->>ConnMgr: start/stop lazy manager on state change
Engine->>Peer: Create connection with ConnConfig.Mode and timeouts
Peer->>Peer: conditionally skip ICE/relay logic per Mode
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/cmd/up.go`:
- Around line 437-439: The connection-mode flag value must be normalized before
persisting or sending: when the flag value (connectionMode variable read via
connectionModeFlag) equals "relay" (case-insensitive, trimmed) convert it to the
canonical "relay-forced" string and then assign that normalized value to
req.ConnectionMode and any profile write paths; leave other values unchanged and
ensure the same normalization is applied in the other occurrences that set
req.ConnectionMode or write the profile (the other blocks referenced around the
same flag usage).
In `@client/internal/peer/env_test.go`:
- Line 25: Update the test case name string in the table entry that currently
reads "connection_mode unparseable falls through to legacy" to use the correct
spelling "connection_mode unparsable falls through to legacy"; locate the tuple
in peer/env_test.go (the table entry with values "garbage", "true",
connectionmode.ModeRelayForced, etc.) and change the first element only to keep
codespell green.
In `@client/internal/peer/env.go`:
- Around line 75-77: Validate the parsed duration from time.ParseDuration(raw)
before assigning to timeoutSecs: ensure the duration is non-negative and that
d.Seconds() does not exceed the maximum uint32 value (so it won't wrap when
cast). Specifically, in the block handling envInactivityThreshold, check d >= 0
and d.Seconds() <= float64(math.MaxUint32) (or compare against
time.Duration(maxUint32Seconds)*time.Second) and only then set timeoutSecs =
uint32(d.Seconds()); otherwise log or handle the invalid value (e.g., ignore the
env value or clamp to max) while still calling
warnDeprecated(envInactivityThreshold, "the relay_timeout setting on the
management server").
In `@client/internal/profilemanager/config.go`:
- Around line 177-179: The timeout fields RelayTimeoutSeconds and
P2pTimeoutSeconds should be made nullable so the persisted profile can
distinguish "unset" from an explicit zero: change their types from uint32 to
*uint32 in the profile struct(s) (e.g., the struct containing ConnectionMode,
RelayTimeoutSeconds, P2pTimeoutSeconds) and keep json:",omitempty" so nil is
omitted while an explicit zero is preserved when set; update any code that
reads/writes these fields (decoders, defaults, and consumers) to handle nil vs
non-nil dereferencing and apply the same change for the corresponding
p2p_timeout_seconds occurrences referenced around lines 604-618.
In `@management/server/account.go`:
- Around line 372-375: UpdateAccountSettings currently doesn't mark
updateAccountPeers or bump the network serial when ConnectionMode,
RelayTimeoutSeconds, or P2pTimeoutSeconds change, so peers won't receive the new
values; inside UpdateAccountSettings (near the existing updateAccountPeers
gating above the calls to am.handleRoutingPeerDNSResolutionSettings /
handleLazyConnectionSettings / handleConnectionModeSettings /
handlePeerLoginExpirationSettings) detect differences on ConnectionMode,
RelayTimeoutSeconds, and P2pTimeoutSeconds between oldSettings and newSettings,
set updateAccountPeers = true (or equivalent), and ensure the account/network
serial is bumped so connected peers are notified of the change.
In `@management/server/http/handlers/accounts/accounts_handler.go`:
- Around line 229-236: The code casts signed API timeout fields
(req.Settings.P2pTimeoutSeconds, req.Settings.RelayTimeoutSeconds) straight to
uint32 which will silently wrap on negative or out-of-range values; add explicit
validation before casting: check each non-nil value is >= 0 and <=
math.MaxUint32 (or 4294967295), and if not return a 400 validation error (use
the same error/response helper used elsewhere in accounts_handler.go). Implement
the checks inline where returnSettings.P2pTimeoutSeconds and
returnSettings.RelayTimeoutSeconds are set (or extract to a small helper
validateTimeout(name string, v *int64) error) and only perform the uint32
conversion after the value passes validation.
In `@shared/management/http/api/types.gen.go`:
- Around line 1517-1521: The handler currently casts pointer timeout fields like
P2pTimeoutSeconds (and the other *int64 timeout fields) directly to uint32,
which allows negative values to wrap; update the account handler code that
performs the cast to first check for nil, then validate the pointed int64 is >=
0 and <= math.MaxUint32, returning a validation error (or reject the request) if
out of range; only then convert safely to uint32. Identify the conversion sites
in the accounts handler where P2pTimeoutSeconds (and the similar timeout fields)
are dereferenced/cast and apply this guard before performing the uint32 cast.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a58c2ed6-4fbf-4f05-a219-4fc44e13ce33
⛔ Files ignored due to path filters (4)
client/proto/daemon.pb.gois excluded by!**/*.pb.goclient/proto/daemon_grpc.pb.gois excluded by!**/*.pb.goshared/management/proto/management.pb.gois excluded by!**/*.pb.goshared/management/proto/proxy_service.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (23)
client/cmd/root.goclient/cmd/up.goclient/internal/conn_mgr.goclient/internal/conn_mgr_test.goclient/internal/connect.goclient/internal/engine.goclient/internal/lazyconn/env.goclient/internal/peer/conn.goclient/internal/peer/env.goclient/internal/peer/env_test.goclient/internal/profilemanager/config.goclient/proto/daemon.protomanagement/internals/shared/grpc/conversion.gomanagement/internals/shared/grpc/conversion_test.gomanagement/server/account.gomanagement/server/activity/codes.gomanagement/server/http/handlers/accounts/accounts_handler.gomanagement/server/types/settings.goshared/connectionmode/mode.goshared/connectionmode/mode_test.goshared/management/http/api/openapi.ymlshared/management/http/api/types.gen.goshared/management/proto/management.proto
…changes UpdateAccountSettings only set updateAccountPeers=true when one of a fixed list of fields changed (RoutingPeerDNSResolutionEnabled, LazyConnectionEnabled, DNSDomain, AutoUpdate*). The Phase 1 ConnectionMode plus RelayTimeoutSeconds and P2pTimeoutSeconds were omitted, so a settings change to any of these was persisted and audited but never bumped the account network serial -- connected peers kept running with the old values until an unrelated change or reconnect picked them up. Add the three Phase 1 fields to the gating block using the existing equalStringPtr / equalUint32Ptr helpers (already used a few lines down in handleConnectionModeSettings for audit emission). Test: TestDefaultAccountManager_UpdateAccountSettings_ConnectionModeIncrementsSerial verifies that: - re-applying identical settings does NOT increment Network.Serial - changing ConnectionMode increments Serial - changing RelayTimeoutSeconds increments Serial - changing P2pTimeoutSeconds increments Serial Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ode null semantics parseAccountSettings narrowed *int64 timeout fields straight to uint32 without checking either end of the range. Negative values wrapped to a huge positive uint32, and values above math.MaxUint32 silently truncated. Both became persisted account state. Add validateUint32Timeout(field, ptr) -> (*uint32, error) and use it for p2p_timeout_seconds and relay_timeout_seconds. Returns a clear "%s must be >= 0 / <= MaxUint32" error so the API caller sees the problem instead of getting a corrupted setting saved. Also correct the ConnectionMode block comment: the previous wording implied JSON null could clear an existing override. The *api.AccountSettingsConnectionMode pointer cannot distinguish absent from explicit null at this layer, so both leave the stored value untouched. Document that and flag the API gap for a future revision. Test: TestValidateUint32Timeout covers nil, zero, common positive, MaxUint32 boundary, negative-rejected, and over-MaxUint32 rejected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…spell
ResolveModeFromEnv parsed NB_LAZY_CONN_INACTIVITY_THRESHOLD with
time.ParseDuration and then narrowed straight to uint32 via
uint32(d.Seconds()). Two ways that silently produced wrong state:
- negative durations (e.g. "-30s") wrapped to a huge positive uint32
- durations larger than ~136 years truncated when seconds overflowed
uint32
Reject both with a warn-and-ignore so the daemon falls back to the
server-pushed value or the built-in default instead of running with a
nonsensical relay timeout.
Test additions:
- "unparsable inactivity duration is ignored" (existing case renamed
from "unparseable" so codespell doesn't trip on the s/ss spelling)
- "negative inactivity duration is ignored"
- "oversized inactivity duration is ignored" (2000000h, > MaxUint32 s)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…int32
The profile Config struct held both timeout fields as
`uint32 \`json:",omitempty"\`` which conflated three states the rest
of the code already cares about:
- "no local override, follow whatever the server pushes"
- "explicit zero" (a future "disable this timer" sentinel)
- "explicit non-zero N seconds"
omitempty drops zero values on save, so a user who explicitly typed
`--relay-timeout 0` would see that intent erased the next time the
profile JSON was rehydrated.
Switch the persistence type to `*uint32`. ConfigInput already used
pointer types so the input-side semantics stay the same; the new
nil/value distinction is now preserved across save/load. connect.go
flattens to the runtime uint32 expected by EngineConfig via a small
derefUint32 helper -- the resolve chain in conn_mgr.go still uses 0
as "fall through" and is left untouched here, so a profile with
explicit-zero is not yet honored at runtime, but the on-disk data
is at least honest for when that resolve gets nullable-aware.
Test:
TestProfileTimeoutsNullableRoundtrip JSON marshals and unmarshals a
matrix of {nil, explicit zero, non-zero} combinations and verifies
nil-ness plus value survives the roundtrip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ParseString only accepted the canonical "relay-forced" form, even though older CLI docs and the dashboard surface "Relay" as the user- facing label. A user passing --connection-mode relay got back an "unknown connection mode" error and silently fell through to ModeUnspecified, which was confusing given the spec uses both names interchangeably. Map both "relay-forced" and "relay" to ModeRelayForced. The canonical persisted form is unchanged (Mode.String() still returns "relay-forced"); the alias is only honored on parse so a config save + reload normalises to canonical. Tests cover the alias plus the existing case-insensitive and whitespace-tolerant behaviour. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/internal/peer/env.go`:
- Around line 55-61: The env parsing branch for EnvKeyNBConnectionMode currently
accepts connectionmode.ModeFollowServer; update the logic after calling
connectionmode.ParseString(raw) (used in this block and the variable mode) to
treat connectionmode.ModeFollowServer the same as
connectionmode.ModeUnspecified: either set mode to parsed only if parsed !=
connectionmode.ModeUnspecified && parsed != connectionmode.ModeFollowServer, or
log a warning and ignore parsed when parsed == connectionmode.ModeFollowServer
so it falls through to server/default resolution; keep using
connectionmode.ParseString and the existing mode variable and warn with
log.Warnf when ignoring follow-server.
In `@management/server/account.go`:
- Around line 462-466: The audit metadata currently collapses nil and 0 for
RelayTimeoutSeconds and P2pTimeoutSeconds which makes transitions ambiguous;
update handleConnectionModeSettings (and any code building the meta payload for
connection-mode changes) to preserve the null state explicitly by storing nil in
the metadata (or include an explicit boolean flag like
RelayTimeoutSecondsIsNil/P2pTimeoutSecondsIsNil) rather than coercing nil to 0,
and ensure both old and new values in the meta payload reflect nil vs numeric
values so audit events show "nil -> 0" correctly.
In `@management/server/http/handlers/accounts/accounts_handler.go`:
- Around line 29-41: The validation currently returns plain fmt.Errorf which
bypasses the handler's status mapping; update validateUint32Timeout to return
structured gRPC-like errors using status.Errorf(status.InvalidArgument, ...) for
both the "<field> must be >= 0" and "<field> must be <= ..." branches (replace
fmt.Errorf calls), and do the same for the connection_mode validation referred
to around the connection_mode checks (lines that validate connection_mode) so
those branches return status.Errorf(status.InvalidArgument, ...) with the same
messages; keep the function signature and returned types unchanged
(validateUint32Timeout, and the connection_mode validation logic).
- Around line 243-266: The handler currently treats omitted and explicit JSON
null the same because req.Settings uses plain pointer fields (connection_mode,
P2pTimeoutSeconds, RelayTimeoutSeconds), so the PUT cannot clear stored
overrides; change to presence-aware decoding (e.g., make the request Settings
struct use json.RawMessage or a small nullable wrapper type that records
"present" vs "absent", or add explicit boolean "Has..." flags) and update the
handler logic around validateUint32Timeout, returnSettings and connection_mode
handling to branch on presence: if the field is present and null -> set the
corresponding returnSettings.* to nil to clear the override; if present and
non-null -> validate/assign as now; if absent -> leave existing value untouched.
Ensure the handler inspects the new presence-aware symbols (the Settings
raw/json fields or HasP2pTimeoutSeconds/HasConnectionMode flags) rather than
relying on pointer nil to decide whether to clear vs omit.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0f25c092-a93d-47d1-b2ed-c1746971259d
📒 Files selected for processing (11)
client/internal/connect.goclient/internal/peer/env.goclient/internal/peer/env_test.goclient/internal/profilemanager/config.goclient/internal/profilemanager/config_test.gomanagement/server/account.gomanagement/server/account_test.gomanagement/server/http/handlers/accounts/accounts_handler.gomanagement/server/http/handlers/accounts/accounts_handler_test.goshared/connectionmode/mode.goshared/connectionmode/mode_test.go
✅ Files skipped from review due to trivial changes (3)
- shared/connectionmode/mode.go
- shared/connectionmode/mode_test.go
- client/internal/peer/env_test.go
| if raw := os.Getenv(EnvKeyNBConnectionMode); raw != "" { | ||
| parsed, err := connectionmode.ParseString(raw) | ||
| if err != nil { | ||
| log.Warnf("ignoring %s=%q: %v", EnvKeyNBConnectionMode, raw, err) | ||
| } else if parsed != connectionmode.ModeUnspecified { | ||
| mode = parsed | ||
| } |
There was a problem hiding this comment.
Reject follow-server in env resolution.
connectionmode.ModeFollowServer is a config-only sentinel, but this branch currently returns it for NB_CONNECTION_MODE. client/internal/conn_mgr.go only strips that sentinel from config values, so NB_CONNECTION_MODE=follow-server becomes the final mode instead of falling through to server/default resolution. Treat it as ModeUnspecified here, or ignore it with a warning.
Suggested fix
if raw := os.Getenv(EnvKeyNBConnectionMode); raw != "" {
parsed, err := connectionmode.ParseString(raw)
if err != nil {
log.Warnf("ignoring %s=%q: %v", EnvKeyNBConnectionMode, raw, err)
+ } else if parsed == connectionmode.ModeFollowServer {
+ log.Warnf("ignoring %s=%q: follow-server is only valid in client config", EnvKeyNBConnectionMode, raw)
} else if parsed != connectionmode.ModeUnspecified {
mode = parsed
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@client/internal/peer/env.go` around lines 55 - 61, The env parsing branch for
EnvKeyNBConnectionMode currently accepts connectionmode.ModeFollowServer; update
the logic after calling connectionmode.ParseString(raw) (used in this block and
the variable mode) to treat connectionmode.ModeFollowServer the same as
connectionmode.ModeUnspecified: either set mode to parsed only if parsed !=
connectionmode.ModeUnspecified && parsed != connectionmode.ModeFollowServer, or
log a warning and ignore parsed when parsed == connectionmode.ModeFollowServer
so it falls through to server/default resolution; keep using
connectionmode.ParseString and the existing mode variable and warn with
log.Warnf when ignoring follow-server.
| // handleConnectionModeSettings emits one audit event per changed Phase-1 | ||
| // connection-mode setting (mode, relay timeout, p2p timeout). Each event | ||
| // carries old/new values in the meta payload so administrators can audit | ||
| // the full transition. NULL transitions show as empty string / 0 in the | ||
| // meta — chosen over a sentinel so the frontend can render uniformly. |
There was a problem hiding this comment.
Preserve NULL versus 0 in timeout audit events.
RelayTimeoutSeconds and P2pTimeoutSeconds use nil for “built-in default”, but 0 is also a valid configured value. Collapsing both to 0 makes the audit trail ambiguous; a nil -> 0 change is currently logged as 0 -> 0. Store nil explicitly in the metadata or add null-state flags alongside the numeric values.
Suggested direction
+func auditUint32Meta(p *uint32) any {
+ if p == nil {
+ return nil
+ }
+ return *p
+}
+
func (am *DefaultAccountManager) handleConnectionModeSettings(ctx context.Context, oldSettings, newSettings *types.Settings, userID, accountID string) {
…
if !equalUint32Ptr(oldSettings.RelayTimeoutSeconds, newSettings.RelayTimeoutSeconds) {
am.StoreEvent(ctx, userID, accountID, accountID, activity.AccountRelayTimeoutChanged, map[string]any{
- "old": derefUint32Ptr(oldSettings.RelayTimeoutSeconds),
- "new": derefUint32Ptr(newSettings.RelayTimeoutSeconds),
+ "old": auditUint32Meta(oldSettings.RelayTimeoutSeconds),
+ "new": auditUint32Meta(newSettings.RelayTimeoutSeconds),
+ "old_is_null": oldSettings.RelayTimeoutSeconds == nil,
+ "new_is_null": newSettings.RelayTimeoutSeconds == nil,
})
}
if !equalUint32Ptr(oldSettings.P2pTimeoutSeconds, newSettings.P2pTimeoutSeconds) {
am.StoreEvent(ctx, userID, accountID, accountID, activity.AccountP2pTimeoutChanged, map[string]any{
- "old": derefUint32Ptr(oldSettings.P2pTimeoutSeconds),
- "new": derefUint32Ptr(newSettings.P2pTimeoutSeconds),
+ "old": auditUint32Meta(oldSettings.P2pTimeoutSeconds),
+ "new": auditUint32Meta(newSettings.P2pTimeoutSeconds),
+ "old_is_null": oldSettings.P2pTimeoutSeconds == nil,
+ "new_is_null": newSettings.P2pTimeoutSeconds == nil,
})
}
}Also applies to: 474-483
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/account.go` around lines 462 - 466, The audit metadata
currently collapses nil and 0 for RelayTimeoutSeconds and P2pTimeoutSeconds
which makes transitions ambiguous; update handleConnectionModeSettings (and any
code building the meta payload for connection-mode changes) to preserve the null
state explicitly by storing nil in the metadata (or include an explicit boolean
flag like RelayTimeoutSecondsIsNil/P2pTimeoutSecondsIsNil) rather than coercing
nil to 0, and ensure both old and new values in the meta payload reflect nil vs
numeric values so audit events show "nil -> 0" correctly.
| func validateUint32Timeout(field string, ptr *int64) (*uint32, error) { | ||
| if ptr == nil { | ||
| return nil, nil | ||
| } | ||
| v := *ptr | ||
| if v < 0 { | ||
| return nil, fmt.Errorf("%s must be >= 0, got %d", field, v) | ||
| } | ||
| if v > int64(math.MaxUint32) { | ||
| return nil, fmt.Errorf("%s must be <= %d, got %d", field, uint32(math.MaxUint32), v) | ||
| } | ||
| out := uint32(v) | ||
| return &out, nil |
There was a problem hiding this comment.
Return structured invalid-argument errors here.
These branches build plain fmt.Errorf values for client validation failures, so they bypass the handler’s normal status.InvalidArgument path and surface as generic server errors instead of a 4xx. Use status.Errorf(status.InvalidArgument, …) for both timeout bounds and connection_mode validation.
Suggested fix
func validateUint32Timeout(field string, ptr *int64) (*uint32, error) {
…
if v < 0 {
- return nil, fmt.Errorf("%s must be >= 0, got %d", field, v)
+ return nil, status.Errorf(status.InvalidArgument, "%s must be >= 0, got %d", field, v)
}
if v > int64(math.MaxUint32) {
- return nil, fmt.Errorf("%s must be <= %d, got %d", field, uint32(math.MaxUint32), v)
+ return nil, status.Errorf(status.InvalidArgument, "%s must be <= %d, got %d", field, uint32(math.MaxUint32), v)
}
…
}
if req.Settings.ConnectionMode != nil {
modeStr := string(*req.Settings.ConnectionMode)
if !req.Settings.ConnectionMode.Valid() {
- return nil, fmt.Errorf("invalid connection_mode %q", modeStr)
+ return nil, status.Errorf(status.InvalidArgument, "invalid connection_mode %q", modeStr)
}
…
}Also applies to: 238-241
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/http/handlers/accounts/accounts_handler.go` around lines 29
- 41, The validation currently returns plain fmt.Errorf which bypasses the
handler's status mapping; update validateUint32Timeout to return structured
gRPC-like errors using status.Errorf(status.InvalidArgument, ...) for both the
"<field> must be >= 0" and "<field> must be <= ..." branches (replace fmt.Errorf
calls), and do the same for the connection_mode validation referred to around
the connection_mode checks (lines that validate connection_mode) so those
branches return status.Errorf(status.InvalidArgument, ...) with the same
messages; keep the function signature and returned types unchanged
(validateUint32Timeout, and the connection_mode validation logic).
| // Persist as the canonical string. Note: the request type uses a | ||
| // non-pointer `*api.AccountSettingsConnectionMode` and JSON null | ||
| // is indistinguishable from an absent field at this layer -- both | ||
| // land here as a nil pointer and skip this whole block, leaving | ||
| // the existing value untouched. There is currently no API path | ||
| // that lets a client *clear* an explicit override; the next | ||
| // settings revision should switch to a sentinel-aware wrapper. | ||
| s := modeStr | ||
| returnSettings.ConnectionMode = &s | ||
| } | ||
| if req.Settings.P2pTimeoutSeconds != nil { | ||
| v, err := validateUint32Timeout("p2p_timeout_seconds", req.Settings.P2pTimeoutSeconds) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| returnSettings.P2pTimeoutSeconds = v | ||
| } | ||
| if req.Settings.RelayTimeoutSeconds != nil { | ||
| v, err := validateUint32Timeout("relay_timeout_seconds", req.Settings.RelayTimeoutSeconds) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| returnSettings.RelayTimeoutSeconds = v | ||
| } |
There was a problem hiding this comment.
This endpoint still can't clear the new nullable overrides.
connection_mode, p2p_timeout_seconds, and relay_timeout_seconds all collapse “field omitted” and “explicit JSON null” into the same nil value here. That means once one of these settings is stored, this PUT handler has no way to reset it back to NULL/built-in-default semantics, which breaks the nullable contract introduced in this PR. This needs presence-aware decoding rather than plain pointer fields.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/http/handlers/accounts/accounts_handler.go` around lines
243 - 266, The handler currently treats omitted and explicit JSON null the same
because req.Settings uses plain pointer fields (connection_mode,
P2pTimeoutSeconds, RelayTimeoutSeconds), so the PUT cannot clear stored
overrides; change to presence-aware decoding (e.g., make the request Settings
struct use json.RawMessage or a small nullable wrapper type that records
"present" vs "absent", or add explicit boolean "Has..." flags) and update the
handler logic around validateUint32Timeout, returnSettings and connection_mode
handling to branch on presence: if the field is present and null -> set the
corresponding returnSettings.* to nil to clear the override; if present and
non-null -> validate/assign as now; if absent -> leave existing value untouched.
Ensure the handler inspects the new presence-aware symbols (the Settings
raw/json fields or HasP2pTimeoutSeconds/HasConnectionMode flags) rather than
relying on pointer nil to decide whether to clear vs omit.




Summary
This PR is Phase 1 of three that together implement issues #5989 and #5990 (the connection-mode consolidation RFC and its server-side per-peer/per-group companion).
In Scope:
ConnectionModeproto enum with four reserved values (relay-forced,p2p,p2p-lazy,p2p-dynamic); three are functional in this PR.p2p-dynamicis reserved on the wire and in the DB but the daemon currently treats it likep2p(pass-through). The decoupledworker_relay/worker_icelifecycle that makesp2p-dynamicdistinct comes in Phase 2.relay_timeout_seconds,p2p_timeout_seconds(both nullable, NULL = built-in default).client/internal/conn_mgr.gowith a single resolver: client env > client config > server-pushed.lazy_connection_enabledboolean (mapped from the resolved mode bytoPeerConfig).connection_mode = UNSPECIFIED(proto default 0) and the new client falls back to the legacy boolean -- verified via tests.NB_FORCE_RELAY,NB_ENABLE_EXPERIMENTAL_LAZY_CONN,--enable-lazy-connection, andNB_LAZY_CONN_INACTIVITY_THRESHOLDcontinue to work and emit one-shot deprecation warnings.The companion Dashboard PR is at netbirdio/dashboard#627 (will be added once it exists).
Implementation map
shared/management/proto/management.proto,client/proto/daemon.protoshared/connectionmode/(Mode, ParseString, FromProto, ToProto, ResolveLegacyLazyBool, ToLazyConnectionEnabled)client/internal/peer/env.go(ResolveModeFromEnv),client/cmd/{root,up}.go(--connection-mode,--relay-timeout,--p2p-timeout)client/internal/{engine,conn_mgr,connect}.go(resolveConnectionMode + UpdatedRemotePeerConfig + EngineConfig fields)client/internal/peer/conn.go(Mode-driven skip-ICE branch)management/server/types/settings.go(new nullable columns),management/server/http/handlers/accounts/accounts_handler.go(PUT validation + response),management/server/account.go(audit emission),management/server/activity/codes.go(3 new event codes),management/internals/shared/grpc/conversion.go(toPeerConfigwrites both old and new wire fields)shared/management/http/api/openapi.yml+ regeneratedtypes.gen.goshared/connectionmode/mode_test.go,client/internal/peer/env_test.go,client/internal/conn_mgr_test.go,management/internals/shared/grpc/conversion_test.goHardware-tested
Verified on a real production NetBird instance with 32 connected peers across 12 OpenWrt-router versions (22.03 to 25.12), Windows 10/11, Debian, Android 12/14, and iOS 26.3.1:
lazy_connection_enabledboolean whichtoPeerConfigkeeps writing via the back-compat mapping).connection_modebetweenp2pandp2p-lazyvia the API: 32 connected throughout, no disconnect.The daemon-side change is exercised end-to-end by switching the account-wide setting and observing the daemon pick up the new mode on the next NetworkMap update (handled by
UpdatedRemotePeerConfig).Known limitations (called out in spec section 8)
relay-forcedcannot be pushed to old clients because the legacylazy_connection_enabledboolean cannot express it. The wire field falls back tofalseand old clients run in normalp2p. Workaround: upgrade the client, or set localNB_FORCE_RELAY=true.p2p-dynamicis not visible in the Dashboard dropdown in Phase 1 even though the API accepts it; setting it via API givesp2pbehaviour until the Phase-2 daemon implementation lands.Phase 2 / 3 follow-ups
p2p-dynamicdaemon implementation -- decoupleworker_relay/worker_iceOnNewOfferregistration, two-tier inactivity manager, plus theDeactivatePeerno-op fix inconn_mgr.gothat Consolidate connection-mode flags; add p2p-dynamic and p2p-dynamic-lazy modes #5989 calls out for the lazy/eager mismatch.connection_mode,p2p_timeout,relay_timeoutwith most-restrictive group-conflict resolution and per-scope audit events.Test plan
client/internal/...andmanagement/...greenTestAccount_GetPeerNetworkMap(existing) remains green -- old peers withoutconnection_moderesolve viaResolveLegacyLazyBoolexactly as beforeTestToPeerConfig_ConnectionModeResolution(new, 9 sub-cases) covers the resolution matrixTestResolveConnectionMode(new, 10 sub-cases) covers the env > config > server precedence chainTestResolveModeFromEnv(new, 9 sub-cases) covers the env-var matrix including the legacy-conflict case🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests